Integrate OpenAI Whisper API for Enhanced Transcription Support #145

agentmarketbot
Contributor

Pull Request Description

Title: Support Additional Whisper Services - Integration of OpenAI Whisper API

Background

This pull request addresses issue #137, which asks for support for additional transcription services in our Telegram bot. The current implementation transcribes voice messages with AWS Whisper. This update adds the OpenAI Whisper API as an alternative transcription service.

Changes Implemented

  1. Integration of OpenAI Whisper API

    • A new OpenAITranscriber class handles transcription through OpenAI's Whisper API and encapsulates all interaction with the OpenAI service.
  2. Modification of Existing Classes

    • The AudioTranscriber class now supports both the AWS and OpenAI transcription services, so users can choose their preferred backend.
  3. Configuration Changes

    • The config.py file has been modified to include:
      • An entry for OPENAI_API_KEY, which is required when using the OpenAI service.
      • A new option, TRANSCRIPTION_SERVICE, which defaults to 'aws'. Users can change this setting to 'openai' to utilize the OpenAI transcription service.
  4. Updates to Bot Handling

    • The bot_handlers.py file now initializes the transcription service selected in the user's configuration (AWS or OpenAI).
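
As an illustration of item 1, a transcriber class along these lines could wrap the Whisper API. This is a minimal sketch, not the code from this PR: the constructor signature, the `client` injection parameter, the `transcribe` method name, and the `whisper-1` default model are all assumptions. The only library call used is `client.audio.transcriptions.create` from the official `openai` Python package.

```python
from pathlib import Path
from typing import Any, Optional


class OpenAITranscriber:
    """Hypothetical sketch of a Whisper API wrapper; the PR's real class may differ."""

    def __init__(
        self,
        api_key: str,
        client: Optional[Any] = None,  # assumption: injectable client for testing
        model: str = "whisper-1",      # assumption: default model name
    ) -> None:
        if client is None:
            if not api_key:
                raise ValueError("OPENAI_API_KEY is required for the OpenAI service")
            # Imported lazily so this module loads even when the openai
            # package is not installed (for example, in AWS-only deployments).
            from openai import OpenAI

            client = OpenAI(api_key=api_key)
        self._client = client
        self._model = model

    def transcribe(self, audio_path: str) -> str:
        """Send the audio file to the transcription endpoint and return its text."""
        with Path(audio_path).open("rb") as audio_file:
            result = self._client.audio.transcriptions.create(
                model=self._model,
                file=audio_file,
            )
        return result.text
```

Accepting an optional client keeps the class testable without network access: a test can pass a stub object that mimics `audio.transcriptions.create`.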

Usage Guidelines

To switch between the AWS and OpenAI transcription services, users should set the following environment variables:

  • TRANSCRIPTION_SERVICE: Set to 'aws' (default) or 'openai' depending on the desired transcription service.
  • OPENAI_API_KEY: Required when TRANSCRIPTION_SERVICE is set to 'openai'; the OpenAI service needs it to authenticate transcription requests.
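
The two variables above could be read and validated roughly as follows. This is a sketch under stated assumptions rather than the contents of config.py: the function name `load_transcription_config`, the `SUPPORTED_SERVICES` tuple, and the case and whitespace normalization are hypothetical. Only the variable names, the allowed values, and the 'aws' default come from this PR.

```python
import os
from typing import Mapping, Optional, Tuple

# Values named in this PR; the constant itself is a hypothetical helper.
SUPPORTED_SERVICES = ("aws", "openai")


def load_transcription_config(
    env: Mapping[str, str] = os.environ,
) -> Tuple[str, Optional[str]]:
    """Return (service, openai_api_key) after validating the environment."""
    # Assumption: values are matched case-insensitively and trimmed.
    service = env.get("TRANSCRIPTION_SERVICE", "aws").strip().lower()
    if service not in SUPPORTED_SERVICES:
        raise ValueError(
            f"TRANSCRIPTION_SERVICE must be one of {SUPPORTED_SERVICES}, got {service!r}"
        )

    # Treat an empty string the same as an unset variable.
    api_key = env.get("OPENAI_API_KEY") or None
    if service == "openai" and api_key is None:
        raise ValueError(
            "OPENAI_API_KEY must be set when TRANSCRIPTION_SERVICE is 'openai'"
        )
    return service, api_key
```

Failing at load time, rather than on the first voice message, surfaces a missing key before the bot starts handling requests.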

Conclusion

This implementation stays backward compatible by keeping the existing AWS service as the default, and adds OpenAI's Whisper API as an opt-in alternative. The goal is to let users pick the backend that best fits their performance and cost requirements.

Please review the changes and let me know if further adjustments or enhancements are needed!

Add alternative audio transcription service using OpenAI's Whisper API 
alongside existing AWS transcription. Key changes include:

- Create new OpenAITranscriber class to handle Whisper API requests
- Modify AudioTranscriber to support both AWS and OpenAI services
- Add configuration options for transcription service selection
- Add OPENAI_API_KEY and TRANSCRIPTION_SERVICE env variables
- Make AWS services optional when using OpenAI transcription

The system now defaults to AWS but can be switched to OpenAI's Whisper
via the TRANSCRIPTION_SERVICE environment variable ('aws' or 'openai').
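
One way to keep AWS optional, as the commit message describes, is to construct only the backend that was selected. The sketch below shows that idea under stated assumptions: the `Transcriber` protocol, the `select_transcriber` function, and the factory mapping are hypothetical names and are not taken from this PR.

```python
from typing import Callable, Dict, Protocol


class Transcriber(Protocol):
    """Hypothetical common interface for both backends."""

    def transcribe(self, audio_path: str) -> str: ...


def select_transcriber(
    service: str,
    factories: Dict[str, Callable[[], Transcriber]],
) -> Transcriber:
    """Build only the requested backend; other factories are never called."""
    try:
        factory = factories[service]
    except KeyError:
        raise ValueError(f"Unknown transcription service: {service!r}") from None
    # Calling the factory here, and nowhere else, means the AWS client and
    # its credentials are only needed when service == "aws".
    return factory()
```

Because each factory runs only when its service is selected, a deployment with TRANSCRIPTION_SERVICE set to 'openai' never touches AWS configuration.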

vadanrod14 closed this on Jan 27, 2025